[None][perf] set ncclConfig graphUsageMode=1 on communicator init#13511
[None][perf] set ncclConfig graphUsageMode=1 on communicator init#13511nv-lschneider wants to merge 2 commits into
Conversation
25e1717 to
be0f60d
Compare
|
/bot run --add-multi-gpu --disable-fail-fast |
📝 WalkthroughWalkthroughNCCL communicator initialization in the common utilities is updated to use the Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~8 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cpp/tensorrt_llm/common/opUtils.cpp (1)
2-2:⚠️ Potential issue | 🟠 MajorUpdate the NVIDIA copyright year in this modified file.
This file is modified in this PR, but the header still ends at 2024. Please update it to include 2026.
Proposed fix
- * SPDX-FileCopyrightText: Copyright (c) 2022-2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved. + * SPDX-FileCopyrightText: Copyright (c) 2022-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.As per coding guidelines,
All TensorRT-LLM source files must contain an NVIDIA copyright header with the year of latest meaningful modificationandInclude NVIDIA copyright header on all new files; update year on modified files.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/tensorrt_llm/common/opUtils.cpp` at line 2, Update the NVIDIA copyright header in cpp/tensorrt_llm/common/opUtils.cpp to reflect the latest modification year by changing the trailing year range that currently ends with "2024" so it includes "2026" (i.e., update the header comment line containing "NVIDIA CORPORATION & AFFILIATES. All rights reserved." to show 2026 as the latest year).
🧹 Nitpick comments (1)
cpp/tensorrt_llm/common/opUtils.cpp (1)
166-166: Consider inline parameter comments for this NCCL call.The argument list is non-obvious; inline
/*paramName=*/comments would make this call easier to audit and maintain.Proposed refactor
- NCCLCHECK_THROW(ncclCommInitRankConfig(ncclComm.get(), group.size(), id, groupRank, &config)); + NCCLCHECK_THROW(ncclCommInitRankConfig( + /*comm=*/ncclComm.get(), /*nranks=*/group.size(), /*commId=*/id, /*myrank=*/groupRank, /*config=*/&config));As per coding guidelines,
In C++ function calls with non-obvious parameters, use inline C comments with the format /*paramName=*/ to document parameters.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/tensorrt_llm/common/opUtils.cpp` at line 166, The NCCL call NCCLCHECK_THROW(ncclCommInitRankConfig(ncclComm.get(), group.size(), id, groupRank, &config)); has non-obvious positional arguments; update this call to use inline parameter comments for clarity by annotating each argument with the parameter name using the /*paramName=*/ style (e.g., /*comm=*/ ncclComm.get(), /*nranks=*/ group.size(), /*uid=*/ id, /*rank=*/ groupRank, /*config=*/ &config) so reviewers can immediately see what each value maps to in ncclCommInitRankConfig.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cpp/tensorrt_llm/common/opUtils.cpp`:
- Around line 164-165: Replace the magic literal 1 assigned to
config.graphUsageMode with a named C++ constant: add a file-local or header
constexpr int kGraphUsageMode = 1 (use camelCase with k prefix) and then set
config.graphUsageMode = kGraphUsageMode; update any nearby comments to reflect
the meaning if needed and use the same kGraphUsageMode symbol to make intent
explicit in ncclConfig_t config initialization code (refer to
config.graphUsageMode and NCCL_CONFIG_INITIALIZER).
---
Outside diff comments:
In `@cpp/tensorrt_llm/common/opUtils.cpp`:
- Line 2: Update the NVIDIA copyright header in
cpp/tensorrt_llm/common/opUtils.cpp to reflect the latest modification year by
changing the trailing year range that currently ends with "2024" so it includes
"2026" (i.e., update the header comment line containing "NVIDIA CORPORATION &
AFFILIATES. All rights reserved." to show 2026 as the latest year).
---
Nitpick comments:
In `@cpp/tensorrt_llm/common/opUtils.cpp`:
- Line 166: The NCCL call NCCLCHECK_THROW(ncclCommInitRankConfig(ncclComm.get(),
group.size(), id, groupRank, &config)); has non-obvious positional arguments;
update this call to use inline parameter comments for clarity by annotating each
argument with the parameter name using the /*paramName=*/ style (e.g., /*comm=*/
ncclComm.get(), /*nranks=*/ group.size(), /*uid=*/ id, /*rank=*/ groupRank,
/*config=*/ &config) so reviewers can immediately see what each value maps to in
ncclCommInitRankConfig.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: df33c482-8b74-4693-b35c-cef7af0d6132
📒 Files selected for processing (1)
cpp/tensorrt_llm/common/opUtils.cpp
|
PR_Github #45743 [ run ] triggered by Bot. Commit: |
|
PR_Github #45743 [ run ] completed with state |
|
/bot run --add-multi-gpu --disable-fail-fast |
|
PR_Github #45976 [ run ] triggered by Bot. Commit: |
|
PR_Github #45976 [ run ] completed with state |
be0f60d to
c501cb2
Compare
|
/bot run --add-multi-gpu --disable-fail-fast |
|
PR_Github #55329 [ run ] triggered by Bot. Commit: |
|
We might want to change this: I can do that tomorrow. |
|
PR_Github #55329 [ run ] completed with state
|
c501cb2 to
08e84e5
Compare
|
/bot run --add-multi-gpu --disable-fail-fast |
Signed-off-by: Ludwig Schneider <lschneider@nvidia.com>
Signed-off-by: Ludwig Schneider <lschneider@nvidia.com>
08e84e5 to
c2afce6
Compare
|
/bot run --add-multi-gpu --disable-fail-fast |
|
PR_Github #55575 [ run ] triggered by Bot. Commit: |
|
PR_Github #55575 [ run ] completed with state
|
|
/bot run --add-multi-gpu --disable-fail-fast |
|
PR_Github #55780 [ run ] triggered by Bot. Commit: |
Summary by CodeRabbit
Description
This sets the NCCL graph mixing support to a single graph.
https://docs.nvidia.com/deeplearning/nccl/user-guide/docs/api/types.html#c.ncclConfig_t.graphUsageMode
This eliminates an overhead before NCCL calls within captured graphs.
The risk is that this shifts the responsibility to not overlap NCCL operations within graphs, with either NCCL operations in other graphs or uncaptured calls.
From what I can see it is safe to do so today, but there is a risk to shift to this behavior.
Careful testing will be necessary.
Test Coverage
All kinds of parallel usage modes should be tested.
Starting with CI.
PR Checklist
GitHub Bot Help
To see a list of available CI bot commands, please comment
/bot help.